Coherence Controller Architectures for Scalable Shared-Memory Multiprocessors

Authors

  • Maged M. Michael
  • Ashwini K. Nanda
  • Beng-Hong Lim
Abstract

Scalable distributed shared-memory architectures rely on coherence controllers on each processing node to synthesize cache-coherent shared memory across the entire machine. The coherence controllers execute coherence protocol handlers that may be hardwired in custom hardware or programmed in a protocol processor within each coherence controller. Although custom hardware runs faster, a protocol processor allows the coherence protocol to be tailored to specific application needs and may shorten hardware development time. Previous research shows minimal increase in application execution time due to protocol processors over custom hardware. With the advent of SMP nodes and faster processors and networks, the trade-off between custom hardware and protocol processors needs to be reexamined. This paper studies the performance of custom hardware and protocol-processor-based coherence controllers in SMP-node-based CC-NUMA systems on applications from the SPLASH-2 suite. Using realistic parameters and detailed models of state-of-the-art system components, it shows that the occupancy of coherence controllers can limit the performance of applications with high communication requirements, where the execution time using commodity protocol processors can be twice as long as using custom hardware. We also investigate the effect of varying several architectural parameters that influence the communication characteristics of the applications and the underlying system on coherence controller performance. We identify measures of applications' communication requirements and their impact on performance. We also study the potential of improving the performance of coherence controllers by separating or duplicating critical components.

Index Terms: Coherence controller, shared memory, multiprocessor, protocol processor.

Similar resources

Memory Network Shared Memory (UMA) Shared Memory (NUMA)

Multiprocessors with shared memory are considered more general and easier to program than message-passing machines. The scalability is, however, in favor of the latter. There are a number of proposals showing how the poor scalability of shared memory multiprocessors can be improved by the introduction of private caches attached to the processors. These caches are kept consistent with each other...

Cache-Coherent Distributed Shared Memory: Perspectives on Its Development and Future Challenges

Distributed shared memory is an architectural approach that allows multiprocessors to support a single shared address space that is implemented with physically distributed memories. Hardware-supported distributed shared memory is becoming the dominant approach for building multiprocessors with moderate to large numbers of processors. Cache coherence allows such architectures to use caching to ta...

Type Data traffic Replacement traffic Coherence traffic UMA

Shared-bus multiprocessors represent a mainstream of accepted and commercially viable computer systems. However, as microprocessors become faster and demand more bandwidth, the already limited scalability of the shared bus decreases even further. As an effort that is not mutually exclusive with, but rather complementary to, developing a better backplane bus, this paper considers adapting distributed shared-memo...

Efficient Implementation of Cache Coherence in Scalable Shared Memory Multiprocessors

The cache coherence scheme for a scalable distributed shared memory multiprocessor should be efficient in terms of memory overhead for maintaining the directories, as well as network latency for a memory request. In this paper, we propose a cache coherence scheme which minimizes the memory access delay and at the same time, reduces the directory overhead by using a limited directory scheme. In t...

Meshes vs. Hypercubes: A case study for Distributed Shared-memory Multiprocessors

Distributed shared-memory multiprocessors (DSMs) are gaining acceptance because they are easier to program than multicomputers. Recently proposed DSMs use a direct interconnection network to access remote memory locations, making these architectures scalable. Most DSMs implement a cache coherence protocol in hardware. This protocol exchanges data and control messages through the interconnection n...

Journal title:
  • IEEE Trans. Computers

Volume 48  Issue 

Pages  -

Publication date 1999